feat: web search for Claude models via GPT /responses backend #273

Merged
caozhiyuan merged 3 commits into
caozhiyuan:dev from
mahoshojoHCG:feat/claude-web-search
Jun 8, 2026

@mahoshojoHCG

Contributor

Problem

GitHub Copilot offers no native web search path for Claude models:

  • Copilot's Anthropic-native /v1/messages endpoint rejects Anthropic's server-side web search tool: 400 — "The use of the web search tool is not supported." (unsupported_value) — for every anthropic-beta variant and both web_search_20250305 / web_search_20260209 tool types.
  • Claude models cannot use the /responses endpoint at all (supported_endpoints is only ["/v1/messages","/chat/completions"]; a direct call returns unsupported_api_for_model).
  • Web search only works through /responses, and only for GPT models (confirmed working).

So a /v1/messages request that carries a web_search server tool currently 400s upstream.

Solution

Fulfill the web search inside the proxy. When a Claude (Messages API) request carries an Anthropic web_search server tool:

  1. The server tool is swapped for an equivalent function tool Copilot's Claude will call.
  2. An agentic loop runs against Copilot's /v1/messages; each web_search call the model makes is answered by Copilot's own GPT /responses web_search tool (no external API key, uses existing Copilot quota).
  3. The searches + answer are reconstructed into native Anthropic server_tool_use + web_search_tool_result blocks plus the final cited text, so clients render it exactly like real Anthropic web search.

Both streaming and non-streaming are supported (streaming replays the reconstructed response as a synthetic Anthropic SSE sequence).
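
As a rough illustration of step 1, here is a minimal sketch of detecting an Anthropic web_search server tool and swapping it for a function tool. The type names, the `isWebSearchServerTool` / `swapWebSearchTools` helpers, and the function tool's description and schema are assumptions made for this example, not the code in this PR.

```typescript
// Hypothetical, simplified view of Anthropic tool definitions (assumption).
type ServerWebSearchTool = { type: string; name: "web_search"; max_uses?: number };
type FunctionTool = {
  name: string;
  description?: string;
  input_schema: Record<string, unknown>;
};
type Tool = ServerWebSearchTool | FunctionTool;

// Server tools carry a versioned `type`, e.g. "web_search_20250305".
function isWebSearchServerTool(tool: Tool): tool is ServerWebSearchTool {
  return "type" in tool && /^web_search_/.test(tool.type);
}

// Function tool the model can call in place of the server tool.
// The description and schema here are illustrative only.
const webSearchFunctionTool: FunctionTool = {
  name: "web_search",
  description: "Search the web and return relevant results.",
  input_schema: {
    type: "object",
    properties: { query: { type: "string" } },
    required: ["query"],
  },
};

// Replace every web_search server tool; leave client tools untouched.
function swapWebSearchTools(tools: Tool[]): Tool[] {
  return tools.map((tool): Tool =>
    isWebSearchServerTool(tool) ? webSearchFunctionTool : tool,
  );
}
```

Keeping the function tool's name as `web_search` means the calls the model emits can be mapped back to server_tool_use blocks without a rename.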

Config

  • useMessagesApiWebSearch (default true) — enable the feature. When off, the server tool is stripped so requests no longer 400.
  • webSearchBackendModel (default "gpt-5-mini") — GPT model used as the search backend.

Notes / limitations

  • Mixed assistant turns (a web_search call alongside a real client tool in the same response) terminate the loop and hand control back to the client.
  • Streaming buffers the loop then replays as SSE — there is inherent latency while searches run, so token-by-token streaming of the final answer is not preserved.
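
To make the buffered replay concrete, here is a minimal sketch of turning an already-complete message into an Anthropic-style SSE event sequence. The `replayAsSse` name and the simplified `Message` type are assumptions for this example. Emitting each non-text block whole inside `content_block_start` is a simplification of the sketch, not necessarily what the PR does.

```typescript
// Simplified message shape (assumption); real messages carry more fields.
type ContentBlock = { type: string; text?: string; [key: string]: unknown };
type Message = {
  id: string;
  model: string;
  content: ContentBlock[];
  stop_reason: string;
};

// Format one server-sent event frame.
function sse(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Replay a finished message as a synthetic event sequence.
function replayAsSse(message: Message): string[] {
  const events: string[] = [];
  events.push(
    sse("message_start", {
      type: "message_start",
      message: { ...message, content: [], stop_reason: null },
    }),
  );
  let index = 0;
  for (const block of message.content) {
    if (block.type === "text") {
      // Text is sent as an empty block followed by a single delta.
      events.push(
        sse("content_block_start", {
          type: "content_block_start",
          index,
          content_block: { type: "text", text: "" },
        }),
      );
      events.push(
        sse("content_block_delta", {
          type: "content_block_delta",
          index,
          delta: { type: "text_delta", text: block.text ?? "" },
        }),
      );
    } else {
      // Simplification: non-text blocks are emitted whole, with no deltas.
      events.push(
        sse("content_block_start", {
          type: "content_block_start",
          index,
          content_block: block,
        }),
      );
    }
    events.push(sse("content_block_stop", { type: "content_block_stop", index }));
    index += 1;
  }
  events.push(
    sse("message_delta", {
      type: "message_delta",
      delta: { stop_reason: message.stop_reason },
    }),
  );
  events.push(sse("message_stop", { type: "message_stop" }));
  return events;
}
```

Because the whole answer arrives in one `text_delta`, the client sees a valid stream but no incremental tokens, which matches the latency limitation noted above.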

Testing

  • New unit tests in tests/web-search-fulfill.test.ts (detection/stripping, fulfillment loop, graceful backend-error block, synthetic stream event sequence).
  • bun run typecheck, bun run lint, and full bun test (281 pass) all green.
  • Live-verified end to end against Copilot: claude-sonnet-4.5 invoked web_search 3×, each fulfilled via gpt-5-mini /responses, producing native server_tool_use / web_search_tool_result blocks, a cited answer, stop_reason: end_turn, and usage.server_tool_use.web_search_requests: 3.
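
For readers unfamiliar with the block shapes named above, here is a minimal sketch of reconstructing one search into server_tool_use + web_search_tool_result + text blocks, including a graceful error block when the backend fails. `SearchOutcome`, `buildWebSearchMessage`, the empty `encrypted_content`, and the fallback text are assumptions for this example. The proxy's actual reconstruction may differ.

```typescript
// Hypothetical result of one backend search (assumption).
type SearchHit = { url: string; title: string };
type SearchOutcome =
  | { ok: true; query: string; hits: SearchHit[]; answer: string }
  | { ok: false; query: string };

function buildWebSearchMessage(outcome: SearchOutcome, toolUseId: string) {
  // The search call, as the client would see it from native web search.
  const toolUse = {
    type: "server_tool_use",
    id: toolUseId,
    name: "web_search",
    input: { query: outcome.query },
  };
  // Either a list of results or a single error object.
  const result = {
    type: "web_search_tool_result",
    tool_use_id: toolUseId,
    content: outcome.ok
      ? outcome.hits.map((hit) => ({
          type: "web_search_result",
          url: hit.url,
          title: hit.title,
          // A proxy cannot mint real encrypted content; left empty here.
          encrypted_content: "",
          page_age: null,
        }))
      : { type: "web_search_tool_result_error", error_code: "unavailable" },
  };
  const text = {
    type: "text",
    text: outcome.ok ? outcome.answer : "Web search is currently unavailable.",
  };
  return {
    content: [toolUse, result, text],
    stop_reason: "end_turn",
    usage: { server_tool_use: { web_search_requests: 1 } },
  };
}
```

This sketch handles a single search. A message built from several searches would repeat the first two blocks per search and sum `web_search_requests`.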

🤖 Generated with Claude Code

…kend

GitHub Copilot rejects Anthropic's native server-side web_search tool on
both its /v1/messages and /chat/completions endpoints, and Claude models
cannot use the /responses endpoint at all — so Copilot offers no native
web search path for Claude.

This fulfills the web_search tool inside the proxy: when a Claude
(Messages API) request carries an Anthropic web_search server tool, it is
swapped for an equivalent function tool and run through an agentic loop
against Copilot's /v1/messages. Each search the model requests is answered
by Copilot's own GPT /responses web_search tool, and the results are
reconstructed into native Anthropic server_tool_use + web_search_tool_result
blocks (plus a final cited answer). Both streaming and non-streaming are
supported (streaming replays the reconstructed response as a synthetic SSE
sequence).

Gated by the new useMessagesApiWebSearch config flag (default on); the
backend GPT model is configurable via webSearchBackendModel (default
gpt-5-mini). When disabled, the server tool is stripped so requests no
longer 400 upstream.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@caozhiyuan

Owner

@mahoshojoHCG You can install an MCP fetch server or another search tool instead.

@caozhiyuan caozhiyuan closed this Jun 6, 2026
@caozhiyuan caozhiyuan reopened this Jun 7, 2026
@caozhiyuan

caozhiyuan commented Jun 7, 2026

Owner

@mahoshojoHCG In Claude Code, the WebSearch tool invokes a sub-model to perform the web search. Can requests whose only tool is of the web search type be switched to the Responses API or to a Messages API? Some providers' Messages APIs also support web search. There are no plans to support mixing web search tools with other tools. Can you adjust the code accordingly? For the switch, configuring a messageApiWebSearchModel should be sufficient.

Zhaoyu Yin and others added 2 commits June 8, 2026 13:48
Per review feedback, replace the Claude agentic loop with a model switch.
A Claude /v1/messages request whose only tool is web_search is now switched
to a configured Responses-capable GPT model (messageApiWebSearchModel,
default gpt-5-mini), which runs Copilot's native /responses web_search in a
single call. The result is reconstructed into native Anthropic
server_tool_use + web_search_tool_result + answer blocks (streaming replays
them as a synthetic SSE sequence).

Mixing web_search with other tools is intentionally unsupported: such
requests have the web_search tool stripped and proceed normally. The feature
is driven entirely by messageApiWebSearchModel (unset = disabled); the
previous useMessagesApiWebSearch / webSearchBackendModel flags are removed.

This avoids multiple Claude round-trips (cheaper, simpler) while keeping the
client-facing behavior identical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

messageApiWebSearchModel now also accepts a `provider/model` alias. When it
points to a provider whose message API supports web search natively (e.g. an
`anthropic`-type provider), the web_search tool is passed straight through to
that provider — no translation, native results. A plain Copilot GPT model
still goes through the /responses web_search path. Both return native
Anthropic web-search blocks, so behavior is consistent across targets.

The routing decision is extracted into a pure, unit-tested resolveWebSearchRoute
helper (provider / responses / strip).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@mahoshojoHCG

Contributor Author

@caozhiyuan Done — pushed two follow-up commits.

1. Both switch targets supported, via the one messageApiWebSearchModel:

  • a provider/model alias → the web_search tool is passed straight through to
    that provider's message API (for an anthropic-type provider this is true
    native web search, zero translation);
  • a Copilot GPT model → handled via Copilot's /responses web_search.

The routing decision is a small pure helper (resolveWebSearchRoute, returning
provider / responses / strip) with unit tests, so it's easy to follow.

2. Result fidelity — kept the structured blocks. Both paths return native
Anthropic server_tool_use + web_search_tool_result + answer, so the shape is
consistent regardless of which target you configure (and explicit source URLs
survive rather than being buried in markdown).

Mixing stays unsupported: a web_search request with any other tool just has
the tool stripped and proceeds normally — only web-search-only requests switch.
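
For illustration, the routing rules described in this comment can be sketched as a pure function. Only the helper name and its three outcomes come from this thread. The signature, the `Tool` and `WebSearchRoute` types, and the choice to return strip when messageApiWebSearchModel is unset are assumptions of this sketch.

```typescript
// Simplified tool shape (assumption): server tools carry a versioned `type`.
type Tool = { name: string; type?: string };

// The three outcomes named in the PR; the field names are hypothetical.
type WebSearchRoute =
  | { kind: "provider"; provider: string; model: string }
  | { kind: "responses"; model: string }
  | { kind: "strip" };

function isWebSearchTool(tool: Tool): boolean {
  return tool.type !== undefined && /^web_search_/.test(tool.type);
}

// Sketch only: assumes the caller has already seen a web_search tool.
function resolveWebSearchRoute(
  tools: Tool[],
  messageApiWebSearchModel?: string,
): WebSearchRoute {
  const onlyWebSearch = tools.length > 0 && tools.every(isWebSearchTool);
  // Feature disabled, or web_search mixed with other tools: strip the tool.
  if (!messageApiWebSearchModel || !onlyWebSearch) {
    return { kind: "strip" };
  }
  // A `provider/model` alias passes the tool through to that provider.
  const slash = messageApiWebSearchModel.indexOf("/");
  if (slash > 0) {
    return {
      kind: "provider",
      provider: messageApiWebSearchModel.slice(0, slash),
      model: messageApiWebSearchModel.slice(slash + 1),
    };
  }
  // A plain model name goes through the /responses web_search path.
  return { kind: "responses", model: messageApiWebSearchModel };
}
```

Keeping the decision free of I/O is what makes it straightforward to unit test: each case is a plain input and a plain expected route.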

typecheck / lint / full suite (287) all pass.

@caozhiyuan caozhiyuan merged commit 0ead4ad into caozhiyuan:dev Jun 8, 2026
1 check passed
2 participants